DBN based multi-stream models for speech
Abstract
We propose dynamic Bayesian network (DBN) based synchronous and asynchronous multi-stream models for noise-robust automatic speech recognition. In these models, multiple noise-robust features are combined in a single DBN to obtain better performance than any single-feature system alone. Results on the Aurora 2.0 noisy speech task show significant improvements of our synchronous model over both single-stream models and a ROVER-based fusion method.
Similar resources
Photo-realistic visual speech synthesis based on AAM features and an articulatory DBN model with constrained asynchrony
This paper presents a photo-realistic visual speech synthesis method based on an audio-visual articulatory dynamic Bayesian network model (AF_AVDBN) in which the maximum asynchronies between the articulatory features, such as lips, tongue and glottis/velum, can be controlled. Perceptual linear prediction (PLP) features from the audio speech and active appearance model (AAM) features from mouth ...
An Investigation of Different Modeling Techniques for Multi-modal Event Classification in Meeting Scenarios
In this work, a hidden Markov model (HMM) and a multistream HMM are compared with a new dynamic Bayesian network (DBN) approach for multi-modal event classification in meeting scenarios. A set of 60 meetings, each with four participants, has been recorded at IDIAP [1]. Given segments of these meetings have been categorized into one of ten different states: consensus, disagreement, discussion, monolo...
Dynamic Bayesian Networks for Multi-Dialect Isolated Arabic Recognition
Hidden Markov models (HMMs) are currently widely used in automatic speech recognition (ASR) as the most effective models. In addition, HMMs are just a special case of dynamic Bayesian networks (DBNs), which are graphical models. DBNs are more sophisticated modeling tools because they allow the inclusion of several specific variables in the problem of automatic speech recognition other than the...
Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition
Recently, deep learning techniques have been successfully applied to automatic speech recognition tasks: first to phonetic recognition with context-independent deep belief network (DBN) hidden Markov models (HMMs) and later to large vocabulary continuous speech recognition using context-dependent (CD) DBN-HMMs. In this paper, we report our most recent experiments designed to understand the role...
Speech Attribute Detection Using Deep Learning
In this work, we present alternative models for attribute speech feature extraction based on two state-of-the-art deep neural networks: a convolutional neural network (CNN) and a feed-forward neural network with pretraining using a stack of restricted Boltzmann machines (DBN-DNN). These attribute detectors are trained using a data-driven approach across all languages in the OGI-TS multi-language te...
Publication date: 2003